A Diffusion-Based Approach for Simulating Forward-in-Time State-Dependent Speciation and Extinction Dynamics

We establish a general framework using a diffusion approximation to simulate forward-in-time state counts or frequencies for cladogenetic state-dependent speciation-extinction (ClaSSE) models. We apply the framework to various two- and three-region geographic-state speciation-extinction (GeoSSE) models. We show that the species range state dynamics simulated under tree-based and diffusion-based processes are comparable. We derive a method to infer rate parameters that are compatible with given observed stationary state frequencies and obtain an analytical result to compute stationary state frequencies for a given set of rate parameters. We also describe a procedure to find the time to reach the stationary frequencies of a ClaSSE model using our diffusion-based approach, which we demonstrate using a worked example for a two-region GeoSSE model. Finally, we discuss how the diffusion framework can be applied to formalize relationships between evolutionary patterns and processes under state-dependent diversification scenarios.


Introduction
The branching events of a phylogenetic tree exhibit a pattern that stores information about the underlying speciation and extinction processes [29].In [29], they first considered a model where both speciation and extinction are treated as a constant-rate birth-death process by which lineages give birth to new lineages (speciation) at a rate λ and lineages die (extinction) at a rate µ.Speciation and extinction rates, however, are expected to vary idiosyncratically among phylogenetic lineages and over geological timescales.For example, [29] also considered another model in which speciation and extinction rates vary over time.Workers have designed birth-death models to study a variety of intrinsic and extrinsic factors that might shape diversification rates.Species age [12,1,32] and inherited traits [18,25,7,8,31] are two types of intrinsic factors thought to drive diversification rates, whereas environment [5,30] and geography [11,21,34] are common extrinsic factors of interest.In the end, a common goal of these models is to infer the underlying event rates given an observed phylogenetic pattern either through likelihood-based [27,33,24] or likelihood-free approaches [29,41,14,20,36].
Fundamentally, birth-death processes model the random arrival times of discrete events that generate or "build" a phylogenetic tree over time [29,25].As an alternative to this tree-based representation of the process, recent work [4] introduced an equivalent diffusion-based representation for a class of birth-death models with state-dependent rates, known as state-dependent speciation-extinction (SSE) models [25].As noted by [4], population genetics theory has benefited immensely from diffusion-based approximations to population-based models of allele frequency change, yet diffusion-based approximations of birth-death models remain underexplored in the phylogenetics literature.Despite the widespread popularity of birth-death models among evolutionary biologists, these models recently entered a phase of intense but overdue scrutiny to better understand what the models can and cannot estimate reliably when fitted to real biological datasets [23,28,39,6,19,22,37,3,35].This has created demand for new frameworks to understand the mathematical properties of these complex stochastic processes to guide biological research programs.
As mentioned above, applying diffusion processes in the macroevolutionary context is not new, and was recently applied by [4] to study the properties of the BiSSE [25] and QuaSSE [7] models.Our work begins by extending the diffusion-based BiSSE representation of [4] to a general multi-state SSE model that allows for both cladogenetic and anagenetic state changes, known as the ClaSSE model [10].We then show how our formulation may be used to determine the relationship between a set of SSE rates and their implied stationary state frequencies.Inverting this perspective, we show that our framework correctly delimits classes of SSE rate values that yield a given set of stationary frequencies.This establishes a many-to-one mapping of SSE rates on to stationary frequencies.After introducing our general framework for ClaSSE models, we apply it to a special geographical case of the ClaSSE model, known as the GeoSSE model [11].We choose the GeoSSE model because it possesses a complex but structured relationship among its parameters and its constituent events -i.e.dispersal, within-region speciation, between-region speciation, and local extinction -that impact lineages over evolutionary time.We then validate our theoretical results by simulating state frequency trajectories using both tree-based and diffusion-based simulators.
The rest of the paper is organized as follows.Firstly, in Section 2.1, we give a brief overview of SSE models in general.In Section 2.2 we visit relevant results in the theory of stochastic process, then in Section 2.3 we apply our framework to analyze the ClaSSE model, and later for the GeoSSE model with arbitrary number of regions in Section 2.4.Following these, in Sections 2.5 and 2.6 we present a method for simulating state dynamics under our framework and deriving rate parameters given stationary state frequencies.In Section 2.7, we derive a result to compute theoretical stationary state frequencies given rate parameters.Moreover, in Section 2.8, we describe a procedure to compute time to reach stationary frequencies in a 2-region GeoSSE system using results derived in Section 2.7.Furthermore, in Section 3.1, we show, through simulation examples, that our diffusion-based framework offers a good approximation for simulating range state dynamics when comparing to tree-based approach.In Section 3.2, using an example, we show the existence of alternative rate scenarios that lead to the same stationary state frequencies.Additionally, we apply results derived in Section 2.7 and Section 2.8 to that example in Section 3.2.Lastly, in Section 4, we summarize our results and discuss promising ways to study pattern-process relationships for data generated by SSE models, and ideas for future work using our framework.

Methods
This section describes the framework for how construct our diffusion approximation for a ClaSSE model to analyze the dynamics of states through time.Key results include derivations of the transition probabilities and the infinitesimal mean and variance parameters of the diffusion equation.We describe and implement the methods for simulating the evolution of state frequencies, and derive relevant results for the stationary conditions, focusing on two-and three-region GeoSSE models, which are special cases of the ClaSSE model.

Overview of state-dependent speciation and extinction models
In this section, we give a brief overview of SSE models by highlighting the key assumptions and different events occurring along lineages.Then, we briefly re-visit a particular SSE model type, the GeoSSE model [11].Then, we guide towards how to shift from tree-based perspective to non-treebased perspective to derive our object of interest.
In general, SSE models are stochastic branching processes with state-dependent birth (speciation) and death (extinction) rates.The states can either be discrete or continuous [25,7,8] and can represent various things, ranging from phenotypic traits to geographical ranges [11].Some SSE models have processes that are only defined by anagenetic process and state-dependent diversification process [25], while others have processes that are defined by both anagenetic and cladogenetic processes [11,10] shown in Fig. 1.An anagenetic process is defined as a process of trait evolution within lineages, between branching events.In the BiSSE model [25], this corresponds to trait transition events of going from a discrete trait A to another discrete trait B or vice versa.These trait-dependent transition rates are encoded in the infinitesimal rate matrix Q, for which the offdiagonal entry q ij defines the rate of transitioning from state i to j.A cladogenetic process is defined as a process in which state transition occurs in conjunction with a branching event (with speciation) of a lineage.SSE models with anagenetic and cladogenetic events are referred to as ClaSSE models.Part of this paper will consider a special case of the ClaSSE model, the GeoSSE model [11].A GeoSSE model describes how species move and evolve among a sets of discrete geographical regions, called species ranges.Species that occur in just one region are said to be endemic to that region.Species occurring in two or more regions are said to be widespread.GeoSSE events can be classified as anagenetic or cladogenetic events.Anagenetic events in GeoSSE include dispersal events and local extinction (sometimes called extirpation) events.Dispersal events add one region to a species range.Local extinction remove one region from a species range.A species experiences complete extinction (i.e. it is removed from the species pool) when it goes locally extinct in the last region in its range.Note that widespread species cannot experience complete extinction through a single event under a GeoSSE model; their widespread ranges must first be reduced to a single region before complete extinction is a possibility.
Cladogenetic events under GeoSSE include within-region speciation and between-region speciation events.Each within-region speciation event creates a new species within any single region of the parental species range.Each between-region speciation event causes a widespread parental species and its range to split, such that all regions in the parental range are distributed among the two new daughter lineages.Section 2.4 defines how GeoSSE assigns rates to different events.
Given a phylogeny with range state information as seen in Fig. 2, one can observe the dynamics of range states accumulated by species though time.In Section 2.2, we present the necessary theory that will later be used to allow us transitioning from a tree-based process to an alternative, diffusion-based process to simulate the dynamics.
Figure 2 An illustration of GeoSSE events on a phylogeny with range state information

Transforming a stochastic process
In this section, we briefly describe the relevant results in the theory of stochastic processes that enable us to transform one stochastic process into another stochastic process.In the context of the ClaSSE model described in Section 2.1, we want to define a process that simulates the (discrete) count of species with state i through time.This process can then be used to define a second process that simulates the (continuous) frequency of species with state i over time.

Theorem 1. Itô's transformation formula
Consider a stochastic process {Z(t)} with infinitesimal parameters µ(z) and σ 2 (z).Define a new stochastic process {Y (t)} with Y (t) = g(Z(t)) where g is a strictly monotone continuous and twice-differentiable function.Then, the new process {Y (t)} has infinitesimal parameters given by, Proof: This theorem is also known as Itô's formula or Itô's lemma.The proof is given in [15,17] .

Lemma 1.
Given a stochastic process {N i (t) := n i (t)} with infinitesimal mean and variance parameters µ i = E(dn i /dt) and σ 2 i = var(dn i /dt), respectively.Define a stochastic process {X(t)} derived using the following transformation.
where {N (t) := i n i (t)} is a stochastic process with infinitesimal parameters defined as follows, Note here we have used the fact that σ ij = 0 for i ̸ = j to account for independent birth-death processes.The infinitesimal mean and variance parameters for {X(t)} are given by, Proof: Proof of Lemma 1 is given in Appendix 7.1.

Diffusion-based framework for state-dependent diversification model
In this section, we establish the framework for simulating state dynamics for state-dependent speciation and extinction models using diffusion processes.We show how to implement the framework in the ClaSSE model introduced in [10].Then, we relate our framework to earlier research [4] using a diffusion process for the BiSSE model [25] and, later on, for the GeoSSE model [11].
Our first goal is to define the stochastic process {N i (t)}, which describes the number of species with state i ∈ S at time t, where S is the state space of the model.Then, using the method presented in Section 2.2, we can obtain the stochastic process {Π i (t)}, which describes the frequency of species with state i at time t.Using these two processes, we then derive results that directly link model parameters with stationary state frequency patterns that the model generates.
To proceed, we define the following probabilities: These probabilities correspond to gaining a new species in state i P + i , losing a species in state i P − i , and neither losing nor gaining a new species in state i (P i ) within an infinitesimal time step ∆t.
For the ClaSSE model, we can write those probabilities as follows, where S + i = Probability of events that lead to an increase in the number of species in state i through state-dependent speciation and speciation in conjunction with cladogenetic state change.
E + i = Probability of events that lead to an increase in the number of species in state i through extinction.
Q + i = Probability of events that lead to an increase in the number of species in state i through anagenetic state change.
S − i = Probability of events that lead to a decrease in the number of species in state i through state-dependent speciation and speciation in conjunction with cladogenetic state change.
E − i = Probability of events that lead to a decrease in the number of species in state i through extinction.
Q − i = Probability of events that lead to a decrease in the number of species in state i through Next, we define the infinitesimal mean µ i = E (dN i /dt) and variance σ 2 i = var (dN i /dt) for the stochastic process {N i (t) : t > 0}.

Lemma 2.
The infinitesimal mean µ i and variance σ 2 i for the stochastic process {N i (t) : t > 0} is given by Proof: Proof of Lemma is given in Appendix 7.2.
Next, we define a stochastic process {Π i (t) : t > 0} where Π i (t) denotes the frequency of species being in state i at time t.We define the infinitesimal mean and variance for the process in Lemma 3.

Lemma 3.
The infinitesimal mean µ Πi and variance σ 2 Πi for the stochastic process {Π i (t) : t > 0} is given by Proof: Proof of Lemma 3 is given in Appendix 7.3.
From Eqs. ( 9)- (10), it is clear that the diffusion parameters i.e. µ Πi , σ 2 Πi are undefined under a total extinction scenario of a tree (i.e.where N = 0 appears in multiple denominators).
To demonstrate the generality of the framework, we show the BiSSE model [25] (and similarly for the MuSSE model [8]) can be represented as a diffusion process as follows.Under the BiSSE model, species possess binary traits with values in the state space S = {1, 2}.BiSSE is a special case of the ClaSSE model that, while it allows anagenetic trait transition and extinction events, its speciation events do not cause cladogenetic trait changes.That is, daughter lineages identically inherit the parent lineage state following speciation.Readers can refer to the supplementary material from [10] for its derivation.For the BiSSE model, we have where λ 1 and µ 1 (b 1 and d 1 in [4]) are speciation and extinction rates for trait 1, respectively.q 12 and q 21 (τ 12 and τ 21 in [4]) are anagenetic trait transition from 1 to 2 and from 2 to 1, respectively.Similarly, following the definitions in Eq. ( 6), we also have Using Eq. ( 7) and Eq. ( 8) we have the infinitesimal mean and variance of N 1 , and similarly for N 2 with indices changed accordingly.These are the same µ 1 and σ 2 1 as described in Eq. (2) in [4].

Diffusion-based framework for the GeoSSE model
In this section, we use the framework established in Section 2.3 for general ClaSSE models to the GeoSSE model.The procedure we apply here is also compatible with any model from the ClaSSE family.For the GeoSSE model, unlike the BiSSE model described in Section 2.3, some speciation events also cause cladogenetic state changes.Thus, following the notation used in the previous section we have, where = Probability of events that lead to an increase in the number of species in range state i through within-region speciation for either widespread or endemic species.
B + i = Probability of events that lead to an increase in the number of species in range state i through between-region speciation for widespread species.
= Probability of events that lead to an increase in the number of species in range state i through extinction for either widespread species (local extinction) or endemic species (species extinction).
= Probability of events that lead to an increase in the number of species in range state i through range dispersal event for endemic species.
W − i = Probability of events that lead to a decrease in the number of species in range state i through within-region speciation for either widespread or endemic species.
B − i = Probability of events that lead to a decrease in the number of species in range state i through between-region speciation for widespread species.
Probability of events that lead to a decrease in the number of species in range state i through extinction for either widespread species (local extinction) or endemic species (species extinction).
Probability of events that lead to a decrease in the number of species in range state i through range dispersal event for endemic species.
Next, consider an n-region GeoSSE model where n ∈ Z + , we define the following state space and variable, R = state space for regions e.g., R = {A, B}. S = state space for species ranges e.g., S = {{A}, {B}, {A, B}} N i = number of species with range state i where i ∈ S.
Then, we define the following rate parameters, d kℓ = per lineage dispersal rate of any species in region k to colonize region ℓ.
w ℓ = per lineage within-region speciation rate of any species in region ℓ.
b i j = per lineage between-region speciation rate of a widespread species into.two daughter species with ranges i and j, respectively.Note that b i j ≡ b j i .e ℓ = local extinction rate of any species in region ℓ.
Thus, both w ℓ and b i j determine state-dependent speciation rate, e ℓ determines state-dependent extinction rate, and d kℓ and (among widespread species) e ℓ determine the anagenetic state transition rate.
We define a stochastic process {N i (t)} with infinitesimal mean µ i = E(dN i /dt) and variance σ 2 i = var(dN i /dt).Here, N i (t) represents the number of species with range state i at time t.The infinitesimal mean µ i and variance σ 2 i follow directly from Lemma 2. We derive the transition probabilities described in Eq. ( 4) in the context of the GeoSSE model, as shown in Eqs. ( 13)- (15).Each of these probabilities describe possible events in a GeoSSE model occurring within an infinitesimal time step that result in gaining a new species with range state i (P + i ), losing a species with range state i (P − i ), and neither losing nor gaining a species with range state i (P i ).
For clarity, we provide the biogeographic interpretation on how each term in Eqs.13-15 is derived and a graphical illustration of the events in Fig. 3.

W +
i .To gain a new species with range state i through a within-region speciation event, the new species range i must contain only region ℓ (ℓ ∈ i and |i| = 1).This endemic species can undergo a speciation event with probability w ℓ N i .Any species with range state j that also occupies region ℓ can undergo a within-region speciation event with probability w ℓ j∈S 1 i⊆j N j .The total probability of this event occurring within ∆t is, j∈S ℓ∈j {ℓ}=i N j w ℓ ∆t.
As an example, in a 2−region GeoSSE system with state space S = {{A}, {B}, {A, B}} we have, To gain a new species with range state i through a dispersal event, the species adds the new region ℓ to its ancestral range.Species are always widespread immediately following dispersal.The total probability of this event occurring within ∆t is, As an example, in a 2−region GeoSSE system with state space S = {{A}, {B}, {A, B}} we have, To gain a new species with range state i through a between-region speciation event, the new species can be either endemic or widespread |i| > 0 that originated from a widespread ancestral species with larger range state j (i ⊂ j).In general, we have no information of whether the new species occurs in left or right lineage following a speciation event, so we do not consider the orientation.The total probability of this event occurring within ∆t is, As an example, in a 2−region GeoSSE system with state space S = {{A}, {B}, {A, B}} we have,

E +
i .To gain a new species with range state i through a local extinction event, the ancestral species must have a larger range state j with size that differs by 1 from the new species' range state i such that |j\i| = 1.The total probability of this event occurring within ∆t is, As an example, in a 2−region GeoSSE system with state space S = {{A}, {B}, {A, B}} we have, The probability of losing a either endemic or widespread species with range state i through a within-region speciation event is 0. This is because the event will only increase the local abundance in a region and causes the widespread abundance to remain unchanged.

D −
i .To lose a species with range state i through a dispersal event, the species must disperse to a new region.The species count remains unchanged if the species already occupies all regions (|i| = |R|).The total probability of this event occurring within ∆t is, As an example, in a 2−region GeoSSE system with state space S = {{A}, {B}, {A, B}} we have, To lose a species with range state i through a between-region speciation event, the species must be widespread and undergo a speciation event that gives rise to a new species in state j with smaller range state size (|j| < |i|).The factor of 1/2 corrects for double-counting the new species with range j being either the left daughter or right daughter lineage.The total probability of this event occurring within ∆t is, As an example, in a 2−region GeoSSE system with state space S = {{A}, {B}, {A, B}} we have, To lose a species with range state i through a local extinction event, a species must undergo an extinction event in one of its regions.If the species is endemic, this event leads to total extinction of the species.The total probability of this event occurring within ∆t is, As an example, in a 2−region GeoSSE system with state space S = {{A}, {B}, {A, B}} we have, The next section uses Eqs.13-15 to define the stochastic process {Π i (t) : t > 0} that models the frequency of species in range state i at time t.The infinitesimal mean µ Πi and variance σ 2 Πi follow directly from Lemma 3.
Figure 3 Graphical illustrations of probabilities of events following Eq.(13) shown in (a), and Eq. ( 14) shown in (b) for a 2-region GeoSSE system with state space S = {{A}, {B}, {A, B}}.N i represents the number of species with range state i ∈ S.An incoming arrow into N i compartment means there is an increase in species count with range state i and an outgoing arrow from N i means there is a decrease in species count with range state i.All the events and arrows are color-coded accordingly.

Comparison on diffusion-based and tree-based models using simulation
In this section we show that our diffusion-based approach correctly models the temporal behaviour of range state frequencies in a GeoSSE model.To validate, we compare our results with a treebased approach that explicitly simulates phylogenetic trees under the same GeoSSE parameter values using the MASTER package [40] implemented in BEAST2 [2].When simulating given a large number of species initially, N (0) >> 0, both diffusion-based and MASTER-based simulations are conditioned only for the process to run until a specific elapsed time T .Later in Appendix 7.6, when we simulate using both approaches starting with a single species in random state, N (0) = 1, we condition the process under both elapsed time and survival until the present.Details for setting up reaction equations for the MASTER simulation can be found in Appendix 7.5.
For simulations under a diffusion, we generate sample paths on [0, T ], where T is the simulation running time.Each simulation yields a time-series of state frequencies for the provided SSE rate values.Simulations were generated as follows: 1. Given the following Itô stochastic differential equation (SDE) and the initial number of species in each range state, N i (0), ∀i ∈ S, where dW t is a Wiener process, we draw a sample path by using the following approximation, where √ ∆tU t ∼ √ ∆tN (0, 1) is a (discretized) standard Wiener process, and µ i (t) and σ i (t) are computed using Eqs ( 7)-( 8), respectively.
2. Given N i (t + ∆t) for each i ∈ S from step 1, we compute the total number of species at 3. Next, using N i (t) and N (t) from steps 1-2, we compute the infinitesimal mean, µ Πi (t), and infinitesimal variance, σ Πi (t) using Eqs.( 9)- (10), respectively.Given µ Πi (t), σ Πi (t), and the following Itô SDE with the initial frequency of species of range state i, Π i (0) = Ni(0) N (0) , where dW t is a Wiener process, we draw a sample path by using the following approximation, where √ ∆tU t ∼ √ ∆tN (0, 1) is a (discretized) standard Wiener process.
In Section 3.1, we show that the dynamic of the range state frequencies can be well-approximated using the diffusion-based framework.We provide different examples through numerical simulations under a variety of GeoSSE scenarios to visualize this result.Specifically, we apply the following procedure, 1. We consider a 3-region GeoSSE model, then we simulate range state dynamics using treebased approach (via the MASTER package in BEAST2) and the diffusion-based approach over 1000 replicates on [0, 10] time interval with 1000 time steps.Note that if one simulates over a longer time interval, then one needs to choose larger time steps to reduce the chance that multiple events occur within ∆t for the diffusion-based approach.For diffusion-based approach, at each time step, we assign a zero value to any state with a count less than zero since the number of species in any range states cannot be negative.This is reasonable because if N i (t) = 0, then some events are not permitted such as a local extinction.Note that although some N i 's might be equal to 0, it is very unlikely for the whole clade to become extinct, i.e., N (t) = 0, given a relatively large clade size at the beginning of each process (Fig. 4) and value of each parameter we pick for the simulations (Figs.5-8).We consider the following scenarios for the GeoSSE model, Example 1. GeoSSE model with only within-region speciation and between-region speciation events (Fig. 5).
Example 2. GeoSSE model with only within-region speciation and range dispersal events (Fig. 6).
Example 3. GeoSSE model with only within-region speciation and local extinction events (Fig. 7).
Figure 4 For each diffusion-based and MASTER-based simulation, we assume that we start each N i (t) simulation, given a relatively large clade size at the beginning, N (0) >> 0 2. We visualize the trajectory of mean state counts for each range state from both diffusion and tree-based approaches.For each simulation, we start the forward-in-time simulation given relatively large clade size for diffusion-based approach to accurately predict the dynamics given by tree-based approach from MASTER simulations.We also visualize stacked bar charts of expected state frequencies for both approaches.To compute the state frequencies under the tree-based approach across replicates, we use the following analytical formula .
We simulate frequency trajectories under the diffusion-based approach using Eq. ( 19).Also for diffusion-based approach, we normalize Π i (t) at each time step for each i ∈ S. Thus, keeping Π i (t) ≤ 1 at any time.
3. We find the 95% confidence intervals of expected state counts at the end time for both diffusion and tree-based simulations for each GeoSSE scenario described above.Then, we apply the Welch's unequal variances t-test [42] for testing the following hypothesis H 0 : μNi,tree = μNi,diffusion where μNi,tree and μNi,diffusion are population means of state counts for range i at the end time from tree and diffusion-based approaches, respectively.
4. We also conduct the F test for testing the following hypothesis Ni,diffusion where σ2 Ni,tree and σ2 Ni,diffusion are population variances of state counts for range i at the end time from tree and diffusion-based approaches, respectively.5. We compute ratio of two sample variances for range state i as , where s 2 i,diffusion and s 2 i,tree are sample variances from diffusion-and tree-based simulations for range state i, respectively.Then, we construct the 95% confidence interval for r i,var .
If the diffusion-based and tree-based simulation methods are statistically indistinguishable, we should fail to reject all null hypotheses and that the confidence intervals of the ratios of variances include the value 1 at the appropriate significance levels.
While all the diffusion-based simulations presented in the main text assume that we always start with a relatively large clade size, this is not how phylogenetic trees are normally simulated.Instead, most simulations generate the entire clade, beginning with one stem or two sister lineages to represent the origin of the process.However, the diffusion approximation assumes the number of species is large.Therefore, to adapt our diffusion-based model for clade-generation scenarios where the initial number of species is small, we adapted our diffusion-based simulation method to start the process with a single species in a random state (see Appendix 7.6).We show that the difference between diffusion-based and tree-based simulations is reduced after applying the correction.
2.6 Deriving rate parameters that lead to stationary state frequencies when N is large In this section, we derive conditions for the rate parameters such that there is no change in state frequency, Π i , over time for a given a range state i ∈ S, assuming large N .That is, we derive the conditions when dΠi dt = 0, ∀i ∈ S.
Furthermore, we assume all rate parameters must be positive, as all modeled events have some non-zero probability of occurring.That is, Next, we define total rates of all events occurring in each range state i, Φ total,i , as follows consist of sums of rates across all adjacent states that correspond to the events . Φ total,i can also be thought as a flux for range state i.That is, it is a difference between total incoming rates and outgoing rates.For example, in a two-region GeoSSE model, we can define Φ total,{A} as follows, where we have r W + {A} = 2w A because within-region speciation rate w A is acting on both endemic species with state {A} and widespread species with state {A, B}.

Lemma 4.
Given a GeoSSE with state space S, set of stationary frequencies, { Πi , ∀i ∈ S}, and initial state frequencies Π i (0), the rate parameters satisfy the following system of equations In Section 3.2, we demonstrate the application of Lemma 4 for a 2-region GeoSSE model.

Deriving stationary state frequencies given rate parameters in a GeoSSE model
In this section, we use our framework to find the stationary state frequencies that result from a given set of rate parameters.This result links the configuration of a data-generating process to its expected pattern, which complements results from Section 2.6 that link expected patterns to data-generating processes.We present the result in Lemma 5 for the case of a 2-region GeoSSE model for simplicity.

Lemma 5.
Consider a 2-region GeoSSE model with state space S = {{A}, {B}, {A, B}}.Given the rate parameters from the model and initial state frequencies, AB , the general solution to Eq. ( 26) is given by, and Furthermore, the stationary frequencies are given by Π{A,B} = 1 − Π{A} − Π{B} , where Proof: Proof of Lemma 5 is given in Appendix 7.4.
We note that this strategy can be generalized to accommodate arbitrary models within the ClaSSE family.Specifically, as seen in the proof of Lemma 5 in Appendix 7.4, for a ClaSSE model with |S| states, one only needs to find eigenvalues (either numerically or analytically) and eigenvectors that correspond to a (|S| − 1) × (|S| − 1) matrix to obtain a general solution.The resulting solution for the stationary frequencies would then reflect the parameterization of the particular ClaSSE model variant being studied.Note that this approach of solving a matrix with one dimension lower than the state space only holds providing that the sum of the remaining frequencies is less than or equal to 1.This assumption, however, can be ignored if one is to solve the full system by finding eigenvalues and eigenvectors that correspond to a |S| × |S| matrix, and normalize the resulting stationary frequencies.
In Section 3.2, we use Lemma 5 using rates obtained from Lemma 4 to verify that the system, indeed, converges to the true stationary frequencies that we observe through simulations.

Deriving time to reach stationary state frequencies in a GeoSSE model
In this section, we describe a method for deriving time to reach stationary state frequencies in a 2-region GeoSSE model.Note that we have assumed a relatively large clade size at the start of the process for simulating Π i (t).Thus, the following is time to stationary frequencies since some relatively large clade size (Fig. 4).
From Lemma 5 in Section 2.7, we have derived an analytical expression to compute state frequencies over time, given large N .In order to find the time to stationarity for each range state, we define the following procedure, as follows 1.Given the initial state frequencies, Π 0 A , Π 0 B , Π 0 AB , and that the system runs from [0, T ], we find the mixing time t * i for all i ∈ S such that, for some ∆t > 0 and ϵ > 0. t * i is the stationary time for the range state i, given the ϵ value.
2. We visually check that t * i derived from the theory reconciles with what we observe from simulations.
We apply this procedure to an example in Section 3.2.

Diffusion-based approach is a good approximation to tree-based approach for describing state dynamics
In this section, we visualize the range state dynamics using tree-based and diffusion-based approaches under several GeoSSE scenarios described in Section 2.5 (Figs.5-8).In all these scenarios, we show that the null hypothesis that the average counts of the ranges states at the end of the simulation time between these approaches are equal cannot be rejected (Table 1).This shows that the diffusion-based approach is a good approximation for means to the tree-based approach.
In most cases, we observe that data (state counts and frequencies) simulated under diffusionbased approach relatively have higher variances compared to data simulated under tree-based approach (Table 1).The 95% confidence interval for the ratio of two variances, shown in Table 1, gives an interval estimate on how much variation one would expect to get for generating state patterns under the diffusion process.Moreover, assuming that data simulated using the MASTER package [40] represent the true distribution of range state counts, this observation implies that diffusion process is not a good approximation for the second moment of the sampled state state frequencies.While this is not ideal, this is to be expected since diffusion is an approximation method to a generative model.Therefore, we should not expect state counts from both approaches to be drawn from the same distribution.We find a set of rate parameters and initial state frequencies that give the following stationary range state frequencies, That is, by Eq. ( 27), we have, We found a set of solutions to Eq. (33).That is, Another set of solutions is given by, 3 , E( Π{B} ) → 1 3 , E( Π{A,B} ) → 1 3 .Using Lemma 5, we confirm that these expected stationary frequencies from simulations converge to the theoretical, and true stationary frequencies given these sets of rates.Furthermore, using the procedure described in Section 2.8 with ϵ = 10 −9 , we found that the stationary frequencies are reached at:

Comparing our method of computing stationary state frequencies with existing literature
In this section, we compare our method for computing stationary state frequencies from rate parameters introduced in Section 2.7 with another method used in diversitree package [8] for the ClaSSE [10] and GeoSSE [11] models.Although the technique used in diversitree has not been discussed in any SSE papers, such as the papers introducing the MuSSE [8], ClaSSE [10], and GeoSSE [11] models, the technique applies projection matrix models that are widely used in the context of population biology to obtain ClaSSE and GeoSSE stationary frequencies (pers.comm.E. E. Goldberg and R. FitzJohn).Originally developed for applications in discrete-time models with either size-structured or age-structured population [38], this approach has also been adapted for continuous-time models with the latter structured population [16].Under this approach, one would create a square matrix with entries that map the state of a structured population from one time to the next.Then, the dominant eigenvalue of such matrix represents the overall population growth rate with its eigenvector represents the stable stage distribution [38].
Through examples below we find that our method returns similar state frequencies to those computed under the projection matrix model in diversitree package [8].For example, under the following rate parameters in a two-region GeoSSE model,

Discussion and Conclusion
In our paper, we have constructed a general framework using diffusion processes to study state dynamics over time from a general state-dependent speciation and extinction model with both anagenetic and cladogenetic state transitions, making it suitable for studying members of the ClaSSE model family [10,26,11,9].We have applied this framework under various diversification scenarios for the GeoSSE model [11], a special case of the ClaSSE model, as described in Sections 2.4-2.5.
Our framework can easily be applied to other discrete state-dependent diversification models, such as simpler BiSSE and MuSSE models [25,8] and Markovian Binary Tree (MBT) models [18,13,31].Through simulations and statistical analyses, we have shown that state dynamics simulated under diffusion-based approach and tree-based approach are comparable, given that we start the simulations with relative large clade size (Figs.5-8, Table 1).We also obtain good agreement between diffusion-based and tree-based simulations when beginning the process with a single species in random state, after applying a model-based correction procedure (Appendix 7.6, Fig. 12).We also show, using a statistical test, that our diffusion framework offers a good approximation for the mean of state counts.This result allows one to understand how data generating process i.e.rate parameters from a diversification model can explain observed state patterns without using phylogenetic information.For inferring rates using empirical state data at present, this diffusion-based approach to simulate state dynamics could be treated as a way to validate whether rates estimated from biological datasets using phylogenetic methods are sensible.
Moreover, in Sections 2.6-2.7,we have derived theoretical results to deduce the expected state frequencies generated by a set of rates, and what possible rates will generate a given set of expected state frequencies.These results are generalizable to accommodate a system having more states, and provide an alternative way to validate the correctness of SSE simulation and inference methods.Additionally, in Section 2.8, we described a procedure to compute the minimum time for an SSE process to reach stationarity in its state frequencies.We have applied these results for a 2-region GeoSSE model.As seen in Figs.9-10, we showed that there exist multiple different rate scenarios that can lead to the same stationary behaviour of state pattern.Our framework also creates an alternative mathematical approach to tree-based models that could help establish conditions for which SSE model parameters are and are not identifiable.
We next plan to study the time for perturbed SSE models to reach stationarity.This would help biologists understand how evolutionary systems re-equilibrate and how long that re-equilibration takes following perturbation.In particular, we plan to apply this framework to study scenarios where SSE rates shift across time [5,30].Scenarios with time-heterogeneous rates are particularly interesting for GeoSSE model variants, mainly because regions experience changes in their features (e.g., region size, distance with nearby regions, separation types) over time.As studied in [21,34], paleogeographically-changing regional features should influence rates of speciation, extinction, and dispersal over time.Mathematical knowledge of expected state (range) frequencies for arbitrary biogeographical systems could help biodiversity researchers assess whether certain clades of regions are within or between states of equilibrium.From Eq. (1), we compute g ′ (N ) and g ′′ (N ),

Funding
Similarly, we have, Now applying Theorem (1) to Eq. ( 1), we have, By definition of the first-order and second-order moments we have, E(N i (t + ∆t)) = (N i + 1)P + i ∆t + (N i − 1)P − i ∆t + (N i )P i ∆t = N i P + i + P − i + P i ∆t + P + i − P − i ∆t = N i + P + i − P − i ∆t, E(N 2 i (t + ∆t) = (N i + 1) 2 P + i ∆t + (N i − 1) 2 P − i ∆t + (N i ) 2 P i ∆t. Thus,

Figure 1
Figure 1 From left to right: a speciation event without cladogenetic state changes, a speciation event with cladogenetic state changes, an anagenetic state change

For i = Figure 11 Figure 12
Figure 11 The trajectories of mean counts of range states for endemic species (a-c) and widespread species (d-g) over the [0, 10] time interval.Trajectories were simulated under the diffusion-based process (red line) and tree-based process (black line), starting with 1 species in a random state.For each starting state, we simulate 150 trajectories (1050 trajectories in total) for each approach.The gray trajectories show the dynamics across 1050 replicates simulated under diffusion-based process.Simulations are conducted using the following parameter values: w A = 0.36, w B = 0.24, w C = 0.28, b A B = b A C = b B C = b A BC = b B AC = b C AB = 0.16, e A = 0.02, e B = 0.03, e C = 0.01, d AB = d BA = 0.12, d AC = d CA = 0.06, d BC = d CB = 0.02

Table 1
The sample mean count for each range state at the end of simulation time, Ni,end , computed under tree-based and diffusion-based simulations across different GeoSSE scenarios described in Section 2.5.The "Lower bound" and "Upper bound" represent the 95% confidence interval of the average count for each range state using diffusion and tree based approaches.The "95% CI r i,var " correspond to the 95% confidence interval of the ratio of two sample variances from diffusion and tree based approaches for range state i. p var and p mean correspond to p value from the F test and the Welch's unequal variances t-test, respectively3.2Multipleratescenarioslead to the same stationary state frequenciesWe apply the theoretical results from Sections 2.6-2.8 for a 2-region GeoSSE model.The different sets of relationships between rate parameters given stationary frequencies in Example 5 are derived using Mathematica[43].In this example, we show that there exist alternative rate scenarios leading to the same stationary frequencies.Furthermore, using Lemma 5, we confirm that the stationary frequencies observed from simulations converge to the theoretical frequencies given the rate parameters, which are derived using Lemma 4. Using the procedure described in Section 2.8, we compute time to stationary frequencies in Example 5 for each rate scenario and different sets of initial frequencies.Example 5. We consider a 2-region GeoSSE model with range state space S = {{A}, {B}, {A, B}}.